Versatile Tiled-Processor Architectures: The Raw Approach
Abstract
Recent advances in VLSI technology have created growing interest within the computer architecture community in building a new kind of “general purpose” processor that can run a broad class of applications, primarily including those from the domain of embedded systems: graphics, wireless processing, networking, and various forms of signal processing. This interest is compounded by growing concern over wire delay, which limits the distance that information can travel in a single clock cycle. The realities of interconnect delay and power consumption seriously challenge the ability of microprocessor designers to fulfill the promise of Moore’s Law. As a result, new architecture designs largely center on scalable, distributed alternatives to current centralized microprocessor designs. Several projects, such as VIRAM [2] at Berkeley, Smart Memories [4] at Stanford, TRIPS [5] at UT-Austin, and Raw [8] and SCALE [3] at MIT, along with industrial efforts such as the Tarantula [1] extension to Alpha, have proposed architectures that organize silicon resources more effectively, as tiled processors that scale easily. The DARPA program in Polymorphic Computing Architectures is also a research thrust in this area. Emerging “polymorphic” architectures will eventually compete with traditional desktop processors (e.g., Pentium IV) not so much through better performance on desktop workloads as through versatility: the ability to run a broader class of applications more effectively. We also expect more versatile architectures to run complex real-world applications more effectively, since complex applications are often composed of diverse components. One such versatile tiled-processor architecture (TPA) is the Raw microprocessor, which was designed and implemented at MIT.
Raw divides the chip into a two-dimensional mesh of sixteen programmable tiles and interconnects them through on-chip, point-to-point scalar operand networks (SON) [7]. The Raw processor can issue sixteen different floating-point, integer, load, store, or branch instructions each cycle. It also has a large set of registers and a distributed memory hierarchy. The SON is exposed to the Raw compilation infrastructure, which orchestrates the flow of data within the network for streaming computation and fine-grained instruction-level parallel processing.
The focus on TPAs and architectural versatility necessitates new benchmark suites and metrics that accurately reflect the goals of the architecture community. Toward that end, we propose both a new benchmark suite, VersaBench, and a new metric called Versatility. VersaBench is a collection of applications from three central tiers (desktops, servers, and embedded systems), encompassing traditional integer workloads, floating-point and scientific applications, server computing, stream processing, and bit-level computation. VersaBench thereby attempts to better characterize the broad set of workloads that the new tiled-processor architectures are required to run.
The Versatility metric is inspired by SPEC rates [6]. For example, the SPEC CINT89 rate for an architecture is the geometric mean of the speedups of that architecture relative to a reference machine (specifically, the VAX 11/780) for each of the applications in the SPEC CINT89 suite. The computation of Versatility is deliberately designed to mirror that of SPEC rates. Accordingly, like SPEC, Versatility takes the geometric mean of the speedups of an architecture for each of the applications in the VersaBench suite. Unlike SPEC rates, however, the speedup of each application is computed not relative to a single reference machine but relative to the architecture that provides the best performance for that application (in the 2004 time frame, from known results at the time of this writing).
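The two metrics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names, the use of run time as the performance measure, the sample numbers, and the assumption that the evaluated architecture is itself among the candidates for "best" are all hypothetical.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of a non-empty collection of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("geometric_mean requires at least one value")
    return prod(values) ** (1.0 / len(values))


def spec_style_rate(run_times, reference_times):
    """SPEC-style rate: geometric mean of per-application speedups
    over a single reference machine (speedup = reference time / run time)."""
    return geometric_mean(reference_times[app] / run_times[app] for app in run_times)


def versatility(arch, run_times_by_arch):
    """Versatility-style score: geometric mean of per-application speedups
    relative to the best-performing architecture for each application.

    Assumption: lower run time means better performance, and the best
    architecture is chosen among those listed in run_times_by_arch.
    """
    ratios = []
    for app, own_time in run_times_by_arch[arch].items():
        best_time = min(
            times[app] for times in run_times_by_arch.values() if app in times
        )
        ratios.append(best_time / own_time)  # 1.0 when arch is the best
    return geometric_mean(ratios)


# Hypothetical run times in seconds for two made-up architectures.
times = {
    "A": {"integer": 1.0, "stream": 8.0},
    "B": {"integer": 2.0, "stream": 2.0},
}
score_a = versatility("A", times)
score_b = versatility("B", times)
```

Under these assumptions a score of 1.0 would mean the architecture is the best known on every application. In the made-up data, architecture B scores higher than A even though A wins the integer workload, because A falls far behind on the streaming one.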
Similar resources
Mapping and Performance of Kernel Benchmarks for Tiled Architectures
Many new computer architecture designs consist of multiple tiles. We define kernel benchmarks and metrics to be used to evaluate tiled architectures for DoD applications, and examine the performance of a prototype tiled architecture, the RAW processor from MIT, on these kernel benchmarks. We compare the performance of RAW to the performance of an embedded system based on the PowerPC G4 and a se...
Tolerating SEU Faults in the Raw Architecture
This paper describes software fault tolerance techniques to mitigate SEU faults in the Raw architecture, which is a single-chip parallel tiled computing architecture. The fault tolerance techniques we use are efficient Checkpointing and Rollback of processor state, Break-pointing, Selective Replication of code and Selective Duplication of tiles. Our fault tolerance techniques can be fully imple...
Short Range Wireless Connectivity for Next Generation Architectures
This paper illustrates preliminary findings in our ongoing efforts on integrating Raw, a proposed scalable tiled processor architecture, with the robust wireless connectivity provided by BlueTooth (BT), one of the most popular short range wireless connectivity standards. Our work of integrating Raw and BT is primarily motivated by two ideas. First, we are interested to evaluate the possible per...
Gigabit IP Routing on Raw
Network processors afford a great degree of flexibility to current day routers, yet they still have followed the trend of being largely specialized to the domain of route table look-up. By using a processor architecture that is more general purpose, routers can gain from economies of scale and increased programmatic flexibility. We propose the use of the Raw Processor [5] as both a network proc...
Energy Scalability of On-Chip Interconnection Networks in Multicore Architectures
On-chip interconnection networks (OCNs) such as point-to-point networks and buses form the communication backbone in systems-on-a-chip, multicore processors, and tiled processors. OCNs can consume significant portions of a chip’s energy budget, so analyzing their energy consumption early in the design cycle becomes important for architectural design decisions. Although numerous studies have exa...